A Bilingual Corpus of Inter-linked Events
Abstract
This paper describes the creation of a bilingual corpus of inter-linked events for Italian and English. Linkage is accomplished through the Inter-Lingual Index (ILI) that links ItalWordNet with WordNet. The availability of this resource, on the one hand, enables contrastive analysis of the linguistic phenomena surrounding events in both languages, and on the other hand, can be used to perform multilingual temporal analysis of texts. In addition to describing the methodology for construction of the inter-linked corpus and the analysis of the data collected, we demonstrate that the ILI could potentially be used to bootstrap the creation of comparable corpora by exporting layers of annotation for words that have the same sense.
Similar resources
Tibetan-Chinese Bilingual Sentences Alignment Method based on Multiple Features
Sentence-level alignment of bilingual parallel corpora plays a significant and indispensable role in machine translation, translation knowledge acquisition, and bilingual lexicography research, and is fundamental work for natural language processing. Given the great deal of work on sentence alignment and the variety of methods developed for bilingual terminology extraction, those are...
Evaluating Compound-to-compound Links in a Sub-sentence Aligned Bilingual Corpus through Example-based Element Recognition
This paper will present an algorithm that evaluates links between one-word compounds and two-word compounds in a bilingual corpus that has been aligned at the sub-sentence level. The phenomenon of linking one-word compounds to multi-word compounds is common when English is being linked to other Germanic languages, and it is difficult to get the links right in the alignment process. The algorith...
X-Linked Lissencephaly with Absent Corpus Callosum and Ambiguous Genitalia: A Case Report
Background: X-linked lissencephaly with ambiguous genitalia (XLAG) is a recently described genetic disorder, in which patients present with lissencephaly, agenesis of the corpus callosum, refractory epilepsy of neonatal onset, acquired microcephaly, and male genotype with ambiguous genitalia. XLAG is responsible for a severe neurological disorder of neonatal onset in boys. A gyration defect con...
Joint search in a bilingual valency lexicon and an annotated corpus
... so I say to you ... search, and you will find ... In this paper and the associated system demo, we present an advanced search system that allows users to perform a joint search over a (bilingual) valency lexicon and a correspondingly annotated linked parallel corpus. This search tool has been developed on the basis of the Prague Czech-English Dependency Treebank, but its ideas are applicable in p...
Bilingual Dictionary Extraction from Wikipedia
The way of mining comparable corpora and the strategy of dictionary extraction are two essential elements of bilingual dictionary extraction from comparable corpora. This paper first proposes a method, which uses the interlanguage link in Wikipedia, to build comparable corpora. The large scale of Wikipedia ensures the quantity of collected comparable corpora. Besides, because the inter-language...